CP-310090 Stunnel lib: Expose unix socket path for TLS proxy#6886
changlei-li wants to merge 4 commits into xapi-project:feature/trusted-certs
Add a module UnixSocketProxy in stunnel lib to provide a unix socket
path that can proxy TLS. This offers a unified mechanism for
different users.
Stunnel listens on the unix socket path, accepts connections
from local requests, then forwards them to the remote host and port over TLS.
Certificate checking for the TLS connection can be done by stunnel
with the new trusted-certs implementation.
Two sets of APIs are provided:
1. A long-running stunnel proxy, for users who want to use it
multiple times and manage the proxy lifecycle themselves.
```OCaml
let stunnel_proxy =
  Stunnel.UnixSocketProxy.start ~verify_cert ~remote_host ~remote_port ()
in
match stunnel_proxy with
| Error e -> (* handle error *)
| Ok proxy_handle ->
    let socket_path = Stunnel.UnixSocketProxy.socket_path proxy_handle in
    (* use socket_path with HTTP clients *)
    ...
    Stunnel.UnixSocketProxy.diagnose proxy_handle |> function
    | Ok () -> (* all good *)
    | Error err -> (* handle connection errors *)
    ...
    Stunnel.UnixSocketProxy.stop proxy_handle (* clean up when done *)
```
2. A short-lived stunnel proxy, for users who just want one-shot
use with automatic cleanup.
```OCaml
Stunnel.UnixSocketProxy.with_proxy ~verify_cert ~remote_host ~remote_port
  (fun proxy_handle ->
    let socket_path = Stunnel.UnixSocketProxy.socket_path proxy_handle in
    (* use socket_path with HTTP clients *)
    ...
    Stunnel.UnixSocketProxy.diagnose proxy_handle
    ...
  )
```
Signed-off-by: Changlei Li <changlei.li@cloud.com>
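The "auto cleanup" behaviour of the short-lived API can be sketched with a generic bracket helper. This is only an illustration of the pattern: the `handle` type, `start`, `stop`, the socket path, and `with_handle` are hypothetical stand-ins, not the signatures introduced by this PR.

```ocaml
(* Hypothetical stand-ins for the proxy handle and its lifecycle functions. *)
type handle = {socket_path: string; mutable running: bool}

let start () : (handle, string) result =
  Ok {socket_path= "/tmp/example.sock"; running= true}

let stop h = h.running <- false

(* Run [f] with a started handle and always stop the handle afterwards,
   even if [f] raises an exception. *)
let with_handle f =
  match start () with
  | Error e ->
      Error e
  | Ok h ->
      Fun.protect ~finally:(fun () -> stop h) (fun () -> Ok (f h))
```

With this shape the caller never sees a live handle outside the callback, which is what makes the one-shot variant hard to leak.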
Currently, verify_error relies on finding "certificate verify failed" and "No certificate or private key specified" in the stunnel log file. In fact, "No certificate or private key specified" is a normal log line for stunnel_proxy. It appears when stunnel configuration fails with verbose logging enabled, so we can remove it; that case is covered by "Configuration failed". "certificate verify failed" does indicate a certificate verification failure, but the detailed reasons are in the preceding lines, such as "CERT: Pre-verification error: unable to get local issuer certificate" and "CERT: Subject checks failed". So the "CERT: " lines are collected, and if "certificate verify failed" is found, those details are raised as the reason.
Signed-off-by: Changlei Li <changlei.li@cloud.com>
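The collection of "CERT: " lines described in this commit could look roughly like the sketch below. It works on an in-memory list of lines; the `verdict` type, `contains`, and `scan_lines` are illustrative names assumed here (the real code reports a Stunnel_error.t and reads the log file).

```ocaml
(* Return true if [sub] occurs anywhere in [s]. *)
let contains ~sub s =
  let n = String.length s and m = String.length sub in
  let rec go i = i + m <= n && (String.sub s i m = sub || go (i + 1)) in
  go 0

(* Hypothetical result type, used only for this sketch. *)
type verdict = No_error | Verify_failed of string list

(* Walk the log lines in order, remembering every "CERT: " line. When the
   "certificate verify failed" marker appears, return the details seen so far. *)
let scan_lines lines =
  let rec go cert_lines = function
    | [] ->
        No_error
    | line :: rest ->
        if contains ~sub:"certificate verify failed" line then
          Verify_failed (List.rev cert_lines)
        else if contains ~sub:"CERT: " line then
          go (line :: cert_lines) rest
        else
          go cert_lines rest
  in
  go [] lines
```

Because the detail lines precede the marker in the log, a single forward pass is enough; no second read of the file is needed.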
In a long-running proxy, every call to diagnose has to read the entire stunnel log, which is inefficient. Record the last checked position so that only new log content is checked.
Signed-off-by: Changlei Li <changlei.li@cloud.com>
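A minimal sketch of position tracking, using only the standard library, might look like this. `read_new_lines` is a hypothetical helper and not the function added by the commit; the caller is assumed to store the returned offset and pass it back on the next call.

```ocaml
(* Read the lines of [path] that start at byte offset [pos]. Returns the
   lines together with the offset to resume from on the next call. *)
let read_new_lines path pos =
  let ic = open_in_bin path in
  Fun.protect
    ~finally:(fun () -> close_in_noerr ic)
    (fun () ->
      seek_in ic pos ;
      let rec loop acc =
        match input_line ic with
        | line ->
            loop (line :: acc)
        | exception End_of_file ->
            (List.rev acc, pos_in ic)
      in
      loop [] )
```

Each call then costs time proportional to the newly appended log content rather than to the whole file.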
Signed-off-by: Changlei Li <changlei.li@cloud.com>
(** Stop a running stunnel proxy and clean up resources.
    This kills the stunnel process and removes the socket and log files. *)

val diagnose : t -> (unit, Stunnel_error.t) result
Is this an expensive operation that exists mostly for debugging? It seems unusual that we rely on a log file. If this operation should be used sparingly, it would be good to mention that.
This is a shortcoming of using stunnel. Although stunnel is a reliable tool for proxying TLS, it is hard to get errors out of it the way you can with a native SSL library. Stunnel has no programmatic API and no formatted error codes, and replacing stunnel in our repo would be a really big project. So the stunnel log is the only way for us to get certificate checking errors.
I'm aware of the fragility, so I created the new file stunnel_log_scanner to handle this, with real-world stunnel logs in the unit tests.
I don't think diagnose is an expensive operation compared to the network event. When reading the log, the input channel uses buffered I/O, so it doesn't make a system call for each line. It also uses lseek to jump to a specific position, which avoids re-reading from the beginning.
(** Stream through lines from a specific position, applying function to each.
    Returns new_position. Stops early on Error. *)
let stream_from_position (filepath : string) (start_pos : int)
Logfiles get rotated, renamed, and compressed. How does this interact with this module?
Don't worry about it. The log being checked is not the secure.log in /var/log. It lives in the forkexecd data dir (the stunnel process is created by forkexecd) while stunnel is running, so it is not rotated, renamed, or compressed.
proxy_pid: pid
; proxy_socket_path: string
; proxy_logfile: string
; mutable last_checked_position: int
The in_channel already remembers the position. Could the ic be used here directly, instead of storing an integer position and re-opening the log file each time the log is checked?
Good idea. Let me try.
let _ = Unix.lseek fd start_pos Unix.SEEK_SET in
let ic = Unix.in_channel_of_descr fd in
let rec loop () =
  match input_line ic with
It's worth mentioning that input_line may return a partial line when stunnel pauses before writing the rest of the line, and hence the checker may miss the signature forever.
Yes, I have considered this issue. In most normal cases, input_line reads until the newline. In rare cases it reads to End_of_file while the line has only been partially flushed to the file. Generally that rare case would be caused by a stunnel crash, which can't be handled.
On the other hand, if the log file ends at "certificate veri" with no newline at the moment the user calls diagnose, the certificate error is indeed missed. But in the user scenario, diagnose is only called after the user's network event fails once, and then it returns. Even reading from the start of the log file wouldn't help in this case.
So I think we needn't consider the rare case.
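For reference, one way a scanner could avoid consuming a half-written line is to treat only newline-terminated text as complete and leave the trailing fragment for the next read. The sketch below shows that split on a plain string; `split_complete` is a hypothetical helper, not part of this PR, and it assumes the caller advances its saved position only past the complete lines.

```ocaml
(* Split [chunk] into the complete lines it contains and the trailing
   fragment (possibly empty) that has not been terminated by a newline yet. *)
let split_complete chunk =
  match String.rindex_opt chunk '\n' with
  | None ->
      ([], chunk)
  | Some i ->
      let complete = String.sub chunk 0 i in
      let rest = String.sub chunk (i + 1) (String.length chunk - i - 1) in
      (String.split_on_char '\n' complete, rest)
```

A fragment such as "certificate veri" would then be returned as the remainder instead of being matched against the signatures, and would be seen whole once stunnel finishes the line.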
max_retries start_pos =
let rec check ~max_retries cnt start_pos =
  match stream_from_position logfile start_pos check_line with
  | End new_pos when cnt <= max_retries ->
For example, when max_retries is 2, it checks 4 times.
check 0
-> 0 <= 2 -> check 1
-> 1 <= 2 -> check 2
-> 2 <= 2 -> check 3
-> 3 <= 2 -> stop
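The count in the trace above can be reproduced with a small stand-alone model of the retry loop. `count_checks` is a hypothetical helper written for this illustration; it assumes every scan reaches the end of the log without a match and that the counter starts at 0, as in the trace.

```ocaml
(* Count how many log scans run when every scan finds nothing.
   [retry_while cnt max_retries] decides whether another attempt is allowed. *)
let count_checks ~retry_while ~max_retries =
  let rec check cnt checks =
    let checks = checks + 1 in
    (* one scan of the log has just happened *)
    if retry_while cnt max_retries then check (cnt + 1) checks else checks
  in
  check 0 0

(* The guard from the diff, cnt <= max_retries: 4 scans for max_retries = 2. *)
let with_le = count_checks ~retry_while:( <= ) ~max_retries:2

(* A strict guard, cnt < max_retries: 1 initial scan plus 2 retries = 3. *)
let with_lt = count_checks ~retry_while:( < ) ~max_retries:2
```

Whether the strict guard is the intended fix depends on whether max_retries is meant to include the first attempt; the sketch only shows the difference in counts.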
type log_line_status = Continue | LineFound | LineError of Stunnel_error.t

type log_scan_result =
  | End of int
  | ScanError of Stunnel_error.t * int
  | ScanFound of int
From the matching's point of view, there would be only two results: Found (Some ...) or Not_found (None).
And for the Found, the caller can determine it's an expected good result or an error. Something like:
type scan_result = (string, string) Result.t option

let find ~sigs ~box line =
  sigs
  |> List.find_map (fun affix ->
         if Astring.String.is_infix ~affix line then
           Some (box line)
         else
           None)

let check_good good = find ~sigs:good ~box:Result.ok
let check_bad bad = find ~sigs:bad ~box:Result.error

(* Try the good signatures first, then fall back to the bad ones. *)
let check_both ~good ~bad line =
  match check_good good line with
  | Some _ as r -> r
  | None -> check_bad bad line

let check_log ~ic ~line_checker ~no_match =
  let rec loop () =
    match input_line ic with
    | line -> (
      match line_checker line with
      | None ->
          loop ()
      | Some r ->
          r
    )
    | exception End_of_file ->
        no_match
  in
  loop ()